Dependency Analysis of Scrambled References for Better Evaluation of Japanese Translation

Authors

  • Hideki Isozaki
  • Natsume Kouchi
Abstract

In English-to-Japanese translation, BLEU (Papineni et al., 2002), the de facto standard evaluation metric for machine translation (MT), has very weak correlation with human judgments (Goto et al., 2011; Goto et al., 2013). Therefore, RIBES (Isozaki et al., 2010; Hirao et al., 2014) was proposed. RIBES measures similarity of the word order of a machine-translated sentence and that of a corresponding human-translated reference sentence. RIBES has much stronger correlation than BLEU but most Japanese sentences have alternative word orders (scrambling), and one reference sentence is not sufficient for fair evaluation. Isozaki et al. (2014) proposed a solution to this problem. This solution generates semantically equivalent word orders of reference sentences. Automatically generated word orders are sometimes incomprehensible or misleading, and they introduced a heuristic rule that filters out such bad sentences. However, their rule is too conservative and generated alternative word orders for only 30% of reference sentences. In this paper, we present a rule-free method that uses a dependency parser to check scrambled sentences and generated alternatives for 80% of sentences. The experimental results show that our method improves sentence-level correlation with human judgments. In addition, strong system-level correlation of single reference RIBES is not damaged very much. We expect this method can be applied to other languages such as German, Korean, Turkish, Hindi, etc.

∗This work was done while the second author was a graduate student of Okayama Prefectural University.

[Figure 1: RIBES has better correlation with adequacy than BLEU (system-level correlation). Axis: Spearman’s ρ with adequacy, 0.0 to 1.0; RIBES and BLEU are compared on NTCIR-7 JE, NTCIR-9 JE and EJ, and NTCIR-10 JE and EJ.]
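The word-order comparison that the abstract attributes to RIBES can be illustrated with a small sketch. It is not the full RIBES definition and not the authors' implementation: it computes only a normalized Kendall's tau over words that occur exactly once in both sentences, and it assumes that several references are combined by taking the best score. The function names and the romanized example sentence are hypothetical.

```python
from itertools import combinations


def word_order_similarity(hypothesis, reference):
    """Normalized Kendall's tau (0.0 to 1.0) between two token lists.

    Simplification: only words that occur exactly once in both
    sentences are compared; all other words are ignored.
    """
    hyp_once = {w for w in hypothesis if hypothesis.count(w) == 1}
    ref_pos = {w: i for i, w in enumerate(reference) if reference.count(w) == 1}
    # Reference positions of the shared words, listed in hypothesis order.
    ranks = [ref_pos[w] for w in hypothesis if w in hyp_once and w in ref_pos]
    if len(ranks) < 2:
        return 0.0
    pairs = list(combinations(ranks, 2))
    concordant = sum(1 for a, b in pairs if a < b)
    tau = 2.0 * concordant / len(pairs) - 1.0
    return (tau + 1.0) / 2.0


def best_over_references(hypothesis, references):
    """Assumed combination rule: score against the closest reference."""
    return max(word_order_similarity(hypothesis, ref) for ref in references)


# Hypothetical romanized example: "Taro handed a book to Hanako".
reference = "taro ga hanako ni hon o watashita".split()
scrambled = "hanako ni taro ga hon o watashita".split()

single = word_order_similarity(scrambled, reference)
multi = best_over_references(scrambled, [reference, scrambled])
print(round(single, 4), multi)
```

With a single reference, the scrambled sentence is penalized for word order even though it is an acceptable reordering; once the scrambled order is also available as a reference, the penalty disappears. This is the motivation for generating alternative word orders of the reference.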


Similar resources

Word-based Japanese typed dependency parsing with grammatical function analysis

We present a novel scheme for word-based Japanese typed dependency parser which integrates syntactic structure analysis and grammatical function analysis such as predicate-argument structure analysis. Compared to bunsetsu-based dependency parsing, which is predominantly used in Japanese NLP, it provides a natural way of extracting syntactic constituents, which is useful for downstream applicatio...


Dependency-based Automatic Enumeration of Semantically Equivalent Word Orders for Evaluating Japanese Translations

Scrambling is acceptable reordering of verb arguments in languages such as Japanese and German. In automatic evaluation of translation quality, BLEU is the de facto standard method, but BLEU has only very weak correlation with human judgements in case of Japanese-to-English/English-to-Japanese translations. Therefore, alternative methods, IMPACT and RIBES, were proposed and they have shown much ...


A Dependency-to-String Model for Chinese-Japanese SMT System

This paper describes the Beijing Jiaotong University Chinese-Japanese machine translation system which participated in the 2nd Workshop on Asian Translation (WAT2015). We exploit the syntactic and semantic knowledge encoded in dependency tree to build a dependency-to-string translation model for Chinese-Japanese statistical machine translation (SMT). Our system achieves a BLEU of 34.87 and a RI...


System Description: Dependency-based Pre-ordering for Japanese-Chinese Machine Translation

This paper describes the Beijing Jiaotong University Japanese-Chinese machine translation system which participated in the 1st Workshop on Asian Translation (WAT 2014). We propose a preordering approach based on dependency parsing for Japanese-Chinese statistical machine translation (SMT). Our system achieves a BLEU of 24.12 and a RIBES of 79.48 on the Japanese-Chinese translation task in the o...


RED: A Reference Dependency Based MT Evaluation Metric

Most of the widely-used automatic evaluation metrics consider only the local fragments of the references and translations, and they ignore the evaluation on the syntax level. Current syntax-based evaluation metrics try to introduce syntax information but suffer from the poor parsing results of the noisy machine translations. To alleviate this problem, we propose a novel dependency-based evaluati...



Publication date: 2015